We present a scalable method for detecting objects and estimating their 3D poses in RGB-D data. To this end, we rely on an efficient representation of object views and employ hashing techniques to match these views against the input frame in a scalable way. While a similar approach already exists for 2D detection, we show how to extend it to estimate the 3D pose of the detected objects. In particular, we explore different hashing strategies and identify the one that is most suitable for our problem. We show empirically that the complexity of our method is sublinear in the number of objects, and we enable detection and pose estimation of many 3D objects with high accuracy while outperforming the state of the art in terms of runtime.
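To make the idea of hash-based view matching concrete, the following is a minimal sketch and not the method evaluated here. It assumes each object view is summarized by a fixed-length binary descriptor, and it builds hash keys from randomly chosen bit subsets, which is only one of many possible hashing strategies. All names (`make_key`, `build_tables`, `query`) and parameters (`num_tables`, `bits_per_key`) are hypothetical.

```python
import random
from collections import defaultdict


def make_key(descriptor, bit_indices):
    """Build a hash key from a fixed subset of descriptor bits."""
    return tuple(descriptor[i] for i in bit_indices)


def build_tables(views, num_tables, bits_per_key, seed=0):
    """Index (object_id, pose_id, descriptor) views in several hash tables.

    Assumption: every descriptor is a list of 0/1 values of equal length.
    """
    rng = random.Random(seed)
    length = len(views[0][2])
    tables = []
    for _ in range(num_tables):
        # Each table hashes on its own random subset of bit positions.
        bit_indices = rng.sample(range(length), bits_per_key)
        buckets = defaultdict(list)
        for object_id, pose_id, descriptor in views:
            key = make_key(descriptor, bit_indices)
            buckets[key].append((object_id, pose_id))
        tables.append((bit_indices, buckets))
    return tables


def query(tables, descriptor):
    """Collect votes for (object_id, pose_id) candidates.

    Only the buckets hit by the query are visited, so the stored
    views are never scanned exhaustively.
    """
    votes = defaultdict(int)
    for bit_indices, buckets in tables:
        key = make_key(descriptor, bit_indices)
        for candidate in buckets.get(key, []):
            votes[candidate] += 1
    # Candidates with the most votes come first.
    return sorted(votes.items(), key=lambda item: -item[1])
```

In this sketch, a descriptor extracted from the input frame is passed to `query`, and the top-voted `(object_id, pose_id)` pairs serve as detection and pose hypotheses that a later verification step could refine. Because a lookup touches only the buckets matching the query keys, its cost depends on bucket occupancy rather than directly on the number of stored objects.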